Everything Totally Explained


Ask & we'll explain, totally!
Linked list
Totally Explained


  NEW! All the latest news in the worlds of computer gaming, entertainment, the environment,  
finance, health, politics, science, stocks & shares, technology and much, much, more.  


View this entry using RSS

Everything about Linked Lists totally explained

In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes. The principal benefit of a linked list over a conventional array is that the order of the linked items may be different from the order that the data items are stored in memory or on disk, allowing the list of items to be traversed in a different order. A linked list is a self-referential datatype because it contains a pointer or link to another datum of the same type. Linked lists permit insertion and removal of nodes at any point in the list in constant time, but don't allow random access. Several different types of linked list exist: singly-linked lists, doubly-linked lists, and circularly-linked lists.
   Linked lists can be implemented in most languages. Languages such as Lisp and Scheme have the data structure built in, along with operations to access the linked list. Procedural or object-oriented languages such as C, C++, and Java typically rely on mutable references to create linked lists.

History

Linked lists were developed in 1955-56 by Allen Newell, Cliff Shaw and Herbert Simon at RAND Corporation as the primary data structure for their Information Processing Language. IPL was used by the authors to develop several early artificial intelligence programs, including the Logic Theory Machine, the General Problem Solver, and a computer chess program. Reports on their work appeared in IRE Transactions on Information Theory in 1956, and several conference proceedings from 1957-1959, including Proceedings of the Western Joint Computer Conference in 1957 and 1958, and Information Processing (Proceedings of the first UNESCO International Conference on Information Processing) in 1959. The now-classic diagram consisting of blocks representing list nodes with arrows pointing to successive list nodes appears in "Programming the Logic Theory Machine" by Newell and Shaw in Proc. WJCC, February 1957. Newell and Simon were recognized with the ACM Turing Award in 1975 for having "made basic contributions to artificial intelligence, the psychology of human cognition, and list processing".
   The problem of machine translation for natural language processing led Victor Yngve at Massachusetts Institute of Technology (MIT) to use linked lists as data structures in his COMIT programming language for computer research in the field of linguistics. A report on this language entitled "A programming language for mechanical translation" appeared in Mechanical Translation in 1958. LISP, standing for list processor, was created by John McCarthy in 1958 while he was at MIT and in 1960 he published its design in a paper in the Communications of the ACM, entitled "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I". One of LISP's major data structures is the linked list.
   By the early 1960s, the utility of both linked lists and languages which use these structures as their primary data representation was well established. Bert Green of the MIT Lincoln Laboratory published a review article entitled "Computer languages for symbol manipulation" in IRE Transactions on Human Factors in Electronics in March 1961 which summarized the advantages of the linked list approach. A later review article, "A Comparison of list-processing computer languages" by Bobrow and Raphael, appeared in Communications of the ACM in April 1964.
   Several operating systems developed by Technical Systems Consultants (originally of West Lafayette Indiana, and later of Chapel Hill, North Carolina) used singly linked lists as file structures. A directory entry pointed to the first sector of a file, and succeeding portions of the file were located by traversing pointers. Systems using this technique included Flex (for the Motorola 6800 CPU), mini-Flex (same CPU), and Flex9 (for the Motorola 6809 CPU). A variant developed by TSC for and marketed by Smoke Signal Broadcasting in California, used doubly linked lists in the same manner.
   The TSS operating system, developed by IBM for the System 360/370 machines, used a double linked list for their file system catalog. The directory structure was similar to Unix, where a directory could contain files and/or other directories and extend to any depth. A utility flea was created to fix file system problems after a crash, since modified portions of the file catalog were sometimes in memory when a crash occurred. Problems were detected by comparing the forward and backward links for consistency. If a forward link was corrupt, then if a backward link to the infected node was found, the forward link was set to the node with the backward link. A humorous comment in the source code where this utility was invoked stated "Everyone knows a flea caller gets rid of bugs in cats".

Types of linked lists

Linearly-linked list

Singly-linked list

The simplest kind of linked list is a singly-linked list (or slist for short), which has one link per node. This link points to the next node in the list, or to a null value or empty list if it's the final node.

A singly-linked list containing two values: the value of the current node and a link to the next node
A singly linked list's node is divided into two parts. The first part holds or points to information about the node, and second part holds the address of next node. A singly linked list travels one way.

Doubly-linked list

A more sophisticated kind of linked list is a doubly-linked list or two-way linked list. Each node has two links: one points to the previous node, or points to a null value or empty list if it's the first node; and one points to the next, or points to a null value or empty list if it's the final node.

A doubly-linked list containing three integer values: the value, the link forward to the next node, and the link backward to the previous node
In some very low level languages, XOR-linking offers a way to implement doubly-linked lists using a single word for both links, although the use of this technique is usually discouraged.

Circularly-linked list

In a circularly-linked list, the first and final nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node. Viewed another way, circularly-linked lists can be seen as having no beginning or end. This type of list is most useful for managing buffers for data ingest, and in cases where you've one object in a list and wish to see all other objects in the list.
   The pointer pointing to the whole list may be called the access pointer.

A circularly-linked list containing three integer values

Sentinel nodes

Linked lists sometimes have a special dummy or sentinel node at the beginning and/or at the end of the list, which isn't used to store data. Its purpose is to simplify or speed up some operations, by ensuring that every data node always has a previous and/or next node, and that every list (even one that contains no data elements) always has a "first" and "last" node. Lisp has such a design - the special value nil is used to mark the end of a 'proper' singly-linked list, or chain of cons cells as they're called. A list doesn't have to end in nil, but a list that didn't would be termed 'improper'.

Applications of linked lists

Linked lists are used as a building block for many other data structures, such as stacks, queues and their variations.
   The "data" field of a node can be another linked list. By this device, one can construct many linked data structures with lists; this practice originated in the Lisp programming language, where linked lists are a primary (though by no means the only) data structure, and is now a common feature of the functional programming style.
   Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. There is very little good to be said about this use of linked lists; they're easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.

Tradeoffs

As with most choices in computer programming and design, no method is well suited to all circumstances. A linked list data structure might work well in one case, but cause problems in another. This is a list of some of the common tradeoffs involving linked list structures. In general, if you've a dynamic collection, where elements are frequently being added and deleted, and the location of new elements added to the list is significant, then benefits of a linked list increase.

Linked lists vs. arrays

Array Linked list
Indexing O(1) O(n)
Inserting / Deleting at end O(1) O(1) or O(n)
Inserting / Deleting in middle (with iterator) O(n) O(1)
Persistent No Singly yes
Locality Great Bad
Linked lists have several advantages over arrays. Elements can be inserted into linked lists indefinitely, while an array will eventually either fill up or need to be resized, an expensive operation that may not even be possible if memory is fragmented. Similarly, an array from which many elements are removed may become wastefully empty or need to be made smaller.
   Further memory savings can be achieved, in certain cases, by sharing the same "tail" of elements among two or more lists — that is, the lists end in the same sequence of elements. In this way, one can add new elements to the front of the list while keeping a reference to both the new and the old versions — a simple example of a persistent data structure.
   On the other hand, arrays allow random access, while linked lists allow only sequential access to elements. Singly-linked lists, in fact, can only be traversed in one direction. This makes linked lists unsuitable for applications where it's useful to look up an element by its index quickly, such as heapsort. Sequential access on arrays is also faster than on linked lists on many machines due to locality of reference and data caches. Linked lists receive almost no benefit from the cache.
   Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values. It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools.
   A number of linked list variants exist that aim to ameliorate some of the above problems. Unrolled linked lists store several elements in each list node, increasing cache performance while decreasing memory overhead for references. CDR coding does both these as well, by replacing references with the actual data referenced, which extends off the end of the referencing record.
   A good example that highlights the pros and cons of using arrays vs. linked lists is by implementing a program that resolves the Josephus problem. The Josephus problem is an election method that works by having a group of people stand in a circle. Starting at a predetermined person, you count around the circle n times. Once you reach nth person, take them out of the circle and have the members close the circle. Then count around the circle the same n times and repeat the process, until only one person is left. That person wins the election. This shows the strengths and weaknesses of a linked list vs. an array, because if you view the people as connected nodes in a circular linked list then it shows how easily the linked list is able to delete nodes (as it only has to rearrange the links to the different nodes). However, the linked list will be poor at finding the next person to remove and will need to recurse through the list until it finds that person. An array, on the other hand, will be poor at deleting nodes (or elements) as it can't remove one node without individually shifting all the elements up the list by one. However, it's exceptionally easy to find the nth person in the circle by directly referencing them by their position in the array.
   The list ranking problem concerns the efficient conversion of a linked list representation into an array. Although trivial for a conventional computer, solving this problem by a parallel algorithm is complicated and has been the subject of much research.

Doubly-linked vs. singly-linked

Double-linked lists require more space per node (unless one uses xor-linking), and their elementary operations are more expensive; but they're often easier to manipulate because they allow sequential access to the list in both directions. In particular, one can insert or delete a node in a constant number of operations given only that node's address. (Compared with singly-linked lists, which require the previous node's address in order to correctly insert or delete.) Some algorithms require access in both directions. On the other hand, they don't allow tail-sharing, and can't be used as persistent data structures.

Circularly-linked vs. linearly-linked

Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point. They also allow quick access to the first and last records through a single pointer (the address of the last element). Their main disadvantage is the complexity of iteration, which has subtle special cases.

Sentinel nodes (header nodes)

Doubly linked lists can be structured without using a front and NULL pointer to the ends of the list. Instead, a node of object type T set with specified default values is used to indicate the "beginning" of the list. This node is known as a Sentinel node and is commonly referred to as a "header" node. Common searching and sorting algorithms are made less complicated through the use of a header node, as every element now points to another element, and never to NULL. The header node, like any other, contains a "next" pointer that points to what is considered by the linked list to be the first element. It also contains a "previous" pointer which points to the last element in the linked list. In this way, a doubly linked list structured around a Sentinel Node is circular.
   The Sentinel node is defined as another node in a doubly linked list would be, but the allocation of a front pointer is unnecessary as the next and previous pointers of the Sentinel node will point to itself. This is defined in the default constructor of the list.
   next == this; prev == this;
   If the previous and next pointers point to the Sentinel node, the list is considered empty. Otherwise, if one or more elements is added, both pointers will point to another node, and the list will contain those elements.
   Sentinel node may simplify certain list operations, by ensuring that the next and/or previous nodes exist for every element. However sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations. To avoid the extra space requirement the sentinel nodes can often be reused as references to the first and/or last node of the list.
   The Sentinel node eliminates the need to keep track of a pointer to the beginning of the list, and also eliminates any errors that could result in the deletion of the first pointer, or any accidental relocation.

Linked list operations

When manipulating linked lists in-place, care must be taken to not use values that you've invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we'll use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Linearly-linked lists

Singly-linked lists

Our node data structure will have two fields. We also keep a variable firstNode which always points to the first node in the list, or is null for an empty list.
    record Node
   Notice that when using external storage, an extra step is needed to extract the record from the node and cast it into the proper data type. This is because both the list of families and the list of members within the family are stored in two linked lists using the same data structure (node), and this language doesn't have parametric types.
   As long as the number of families that a member can belong to is known at compile time, internal storage works fine. If, however, a member needed to be included in an arbitrary number of families, with the specific number known only at run time, external storage would be necessary.

Speeding up search

Finding a specific element in a linked list, even if it's sorted, normally requires O(n) time (linear search). This is one of the primary disadvantages of linked lists over other data structures. In addition to some of the variants discussed in the above section, there are number of simple ways of improving search time.
   In an unordered list, one simple heuristic for decreasing average search time is the move-to-front heuristic, which simply moves an element to the beginning of the list once it's found. This scheme, handy for creating simple caches, ensures that the most recently used items are also the quickest to find again.
   Another common approach is to "index" a linked list using a more efficient external data structure. For example, one can build a red-black tree or hash table whose elements are references to the linked list nodes. Multiple such indexes can be built on a single list. The disadvantage is that these indexes may need to be updated each time a node is added or removed (or at least, before that index is used again).

Related data structures

Both stacks and queues are often implemented using linked lists, and simply restrict the type of operations which are supported.
   The skip list is a linked list augmented with layers of pointers for quickly jumping over large numbers of elements, and then descending to the next layer. This process continues down to the bottom layer, which is the actual list.
   A binary tree can be seen as a type of linked list where the elements are themselves linked lists of the same nature. The result is that each node may include a reference to the first node of one or two other linked lists, which, together with their contents, form the subtrees below that node.
   An unrolled linked list is a linked list in which each node contains an array of data values. This leads to improved cache performance, since more list elements are contiguous in memory, and reduced memory overhead, because less metadata needs to be stored for each element of the list.
   A hash table may use linked lists to store the chains of items that hash to the same position in the hash table.
   A heap shares some of the ordering properties of a linked list, but is almost always implemented using an array. Instead of references from node to node, the next and previous data indexes are calculated using the current data's index.

Further Information

Get more info on 'Linked Lists'.


External Link Exchanges

Do you know how hard it is to get a link from a large encyclopaedia? Well we're different and will prove it. To get a link from us just add the following HTML to your site on a relevant page:

    <a href="http://linked_list.totallyexplained.com">Linked list Totally Explained</a>

Then simply click through this link from your web page. Our crawlers will verify your link, extract the title of your web page and instantly add a link back to it. If you like you can remove the words Totally Explained and embed the link in article text.
   As long as your link remains in place, we'll keep our link to you right here. Please play fair - our crawlers are watching. Your site must be closely related to this one's topic. Any kind of spamming, dubious practises or removing the link will result in your link from us being dropped and, potentially, your whole site being banned.



Copyright © 2007-8 totallyexplained.com | Licensed under the GNU Free Documentation License | Site Map
This article contains text from the Wikipedia article Linked list (History) and is released under the GFDL | RSS Version